ABSTRACT

This module, the third of four units about vocational 
competency measurement, is module 19 in the Vocational Education 
Curriculum Specialist series. The purpose stated for the document is 
to help in the development of both written and performance competency 
tests based on the needs identified for testing a particular program 
and the standards and priorities established for job-related tasks. 
Content is organized into four sections, each of which focuses on one 
goal and two or more objectives. Section 1 summarizes important
considerations in vocational competency-test development, including
test validity, reliability, practicality, and testing individuals
with special needs. In section 2, important considerations in
designing initial test specifications are overviewed. Sections 3 and
4 discuss the critical tasks in developing paper-and-pencil and 
performance tests. Each section concludes with individual study 
activities, discussion questions, and group activities. Self-check 
items and possible responses to them are appended for use as a 
pretest and review of the module content. (YLB)

Introduction



This is the third of four modules dealing with the use, 
development, and validation of vocational competency tests. 
Earlier modules provided an overview of using competency
measures in vocational education programs (Module 17) and a
discussion of how to determine requirements for vocational
competency measures (Module 18). The last module (Module 20)
considers approaches to validating competency tests and using
test results.

The purpose of this module is to help you develop compe- 
tency tests — both written and performance — based on the needs 
you've identified for testing a particular program and the 
standards and priorities you've established for job-related 
tasks. The techniques presented here are based on the experi- 
ences of the American Institutes for Research in conducting 
the Vocational Competency Measures (VCM) project for the U.S. 
Department of Education as well as on previous test development 
experience of project staff.



Overview 

The development of a vocational competency test requires 
the planning, coordination, and skillful execution of many 
activities. A test that has been developed following the
procedures outlined in this module should provide supervisors,
instructors, and students with information on how closely the
skills taught and learned in the educational program compare
with the work standards and skills expected in industry.

The approach used in this module provides considerable
flexibility in the development process, but at the same time,
it has a sufficiently structured framework to provide clear
guidance. Although the test development procedures are
intended for use in moderate to large test development efforts,
small districts or individual schools will also find
information that can be adapted to smaller-scale projects. If you
want a more detailed knowledge of competency testing or
testing in general, the Recommended References in the Appendices
should be useful.

The development of a vocational competency test is both a 
creative and a mechanical process. This module can only
describe the mechanics of a test development project. It is
hoped that the framework given will allow you to use your
creative talents most fully and effectively.

Instructions to the Learner



The Self-Check items and possible responses to them are
found in the Appendices. These questions have two purposes.
First, before you begin work on the module, you may use them
to check quickly whether you have already learned the
information in previous classes or readings. In some instances, with
the consent of your instructor, you might decide to skip a
whole module or parts of one. The second purpose of the
Self-Check is to help you review the content of modules you have
studied in order to assess whether you have achieved the
module's goals and objectives.

You can also use the list of goals and objectives that
follows to determine whether the module content is new to you
and requires in-depth study, or whether the module can serve
as a brief review before you continue to the next module.

Goals and Objectives



Goal 1: Summarize important considerations in vocational
competency test development.

Objective 1.1 State the importance of test validity,
test reliability, and test practicality.

Objective 1.2 List important considerations in developing
tests to include individuals with special needs.



Goal 2: Summarize important considerations in designing
initial test specifications.

Objective 2.1 State the purpose of designing initial
test specifications and list items to be included in the
specifications.

Objective 2.2 Compare the strengths and weaknesses of
paper-and-pencil tests and performance tests.

Objective 2.3 Describe common formats of paper-and-pencil
tests.

Objective 2.4 Describe types of performance evaluation.

Goal 3: Discuss the critical tasks in developing
paper-and-pencil tests.

Objective 3.1 Describe the importance and process of
creating an item budget.

Objective 3.2 List important considerations in the
initial review and modification of test items.

Objective 3.3 Compare and contrast the processes of
pilot testing and field testing.

Objective 3.4 Based on field testing, discuss the bases
for revising test items.



Goal 4: Discuss the critical tasks in developing performance
tests.

Objective 4.1 List the components of a performance test.

Objective 4.2 Identify key considerations in selecting
and structuring tasks for performance test development.

Objective 4.3 Describe key considerations in reviewing
performance test items.



Resources

In order to complete the learning activities in this
module, you will need information contained in the following
publication:



Erickson, R. C., & Wentling, T. L. Measuring student
growth: Techniques and procedures for occupational
education. Urbana, Ill.: Griffon Press, 1976.

GOAL 1: Summarize important considerations in vocational
competency test development.



What Are Important Considerations in Test Development?



The development of any test requires that certain
technical and practical considerations be kept in mind. The test
developer should attempt to have the final test satisfy as
closely as possible the three general requirements of every
good test: validity, reliability, and practicality. The test
developer should also ensure that testing procedures give all
examinees, including those with special needs, a fair and
equal opportunity to be tested on their skills and knowledge.



Test Validity 

The validity of a test means the extent to which a test
measures what it is intended to measure. The purpose or
intent of a test, in turn, is always to relate to some
criterion in the real world. When the test is first conceived,
this intent is reflected in the careful selection of content.
When the test is tried out in preliminary form, those items
that seem to measure the criterion best are determined, and
only those are included in the final form. An approach for
validating competency tests is described in Validating
Competency Tests and Using Test Results, Module 20 in the VECS
series.



Test Reliability 

Test reliability refers to the consistency of the test. 
A reliable test would yield close to the same score for the 
same individual time after time. We can't expect any test to
produce exactly the same results each time for the same
subject even if the subject learned nothing between the first and
second testing. Guessing and other factors will influence
scores on repeated administrations, but a well-constructed
test should yield essentially the same score each time. Thus,
the relative standing of any group of examinees given a
reliable test will vary only a small amount between different
administrations.
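
One common way to put a number on this kind of consistency is to correlate the scores from two administrations of the same test. The sketch below is illustrative only: the function name and the scores are hypothetical, and this module does not prescribe a particular statistic.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between scores from two administrations."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length score lists, 2+ examinees")
    n = len(first)
    mean_1 = sum(first) / n
    mean_2 = sum(second) / n
    # Co-variation of each examinee's two scores around the group means.
    cov = sum((a - mean_1) * (b - mean_2) for a, b in zip(first, second))
    var_1 = sum((a - mean_1) ** 2 for a in first)
    var_2 = sum((b - mean_2) ** 2 for b in second)
    return cov / sqrt(var_1 * var_2)


# Hypothetical scores for five examinees tested twice.
first_scores = [42, 35, 50, 28, 39]
second_scores = [44, 33, 49, 30, 40]
print(round(pearson_r(first_scores, second_scores), 2))
```

A value close to 1 suggests that examinees kept roughly the same relative standing across the two administrations; a low value suggests the scores are being driven by guessing or other unstable factors.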

Test Practicality

To be practical, a test should be feasible to construct,
capable of administration without confusion, and easy to score
with precision (see Adkins, 1974). This advice is absolutely
vital when applied specifically to vocational competency
testing.

Construction. The two most basic necessities for
constructing a test are time and expertise. An adequate and
realistic amount of time should be budgeted for a test
development project. The development strategy and scope of the test
must be within the organizational capability and personnel
expertise of the test development group.

Administration. The administration of a test involves
the examiner and examinees. In order to properly administer a
test, the examiner requires clear, simple, and complete
directions regarding every required task. The examinees must also
know what is required of them. The internal simplicity and
organization of a test depend on the complexity of the
occupation and the level of skill to be assessed by the test. A
reasonable rule of thumb is to maintain the same level of
technical complexity in the test as in the occupation being
tested.

Scoring. Scoring procedures that are straightforward and
can translate easily into usable summaries of student
performance are clearly the most useful for vocational training.

These three areas of practicality may overlap, but by
considering them independently we are able to get a clearer
picture of the "practical" considerations involved in test
development.

Testing Individuals With Special Needs

The purpose of a vocational competency test is to assess
job-related skills and knowledge. To meet this objective, it
is necessary to plan the testing procedures to ensure that all
examinees have a fair and equal opportunity to be tested on
their skills and knowledge. Judgments about the competencies
of persons with special needs should be based on their
knowledge of the job and their capacity to accomplish important
job-related tasks.

A helpful guide suggested by the National Research
Council for use in modifying tests is the Guide for Administering
Examinations to Handicapped Individuals for Employment Purposes
(Heaton, Nelson, & Nester, 1980). After a series of
modifications to the testing procedures has been proposed, it is
useful to have the modifications reviewed by a group of experts
who are themselves handicapped. As the Council stated in its
recommendations,

    No one knows as well as a knowledgeable and sensitive
    blind person, for example, what difficulties other
    blind people are likely to encounter on a particular
    test (p. 135).

Modifications to the testing procedure will also be necessary
for persons with limited ability in English, unless knowledge
of English is a requirement for performing the job
satisfactorily.

Table 1 lists some adaptation suggestions for the test
developers and the examiners; in addition, other reasonable
local adaptations should be considered.

Judgments about the competencies of persons with special
needs should be based on their knowledge and abilities, not
their physical or linguistic limitations. Abilities,
therefore, should be assessed from the outcomes rather than
from the particular method used to achieve these outcomes.

TABLE 1
SUGGESTED ADAPTATION PROCEDURES FOR SPECIAL GROUPS

Special Condition     Testing Problems        Adaptations

Hearing impairment    Can't hear oral         Provide printed equivalent;
                      instructions            use audio amplification

Orthopedic upper      Can't complete          Have assistant complete
limb disability       response sheet          blanks; give test orally;
                      blanks; difficulty      permit the use of jigs
                      in reaching or          and guides
                      handling standard
                      tools or equipment

Blindness/low         Can't read printed      Read aloud and repeat;
vision                test materials;         adjust testing time;
                      can't use charts        prepare raised-line tactile
                      and illustrations;      drawings with oral
                      can't see dials         descriptions; use adapted
                      or markings             equipment (e.g., use
                                              measuring tools that have
                                              raised tactile markings);
                                              use actual objects

Limited-English       Can't understand        Use translations unless
proficiency           or read directions      knowledge of English is
                      in English              a job requirement

Individual Study Activities

1. Select one of the three general requirements of every
good test (validity, reliability, or practicality) and write
a one-page paper stating its importance in developing
vocational competency tests. Refer to the Resources or
Recommended References in this module, or to one of your
own resources, for information on test validity,
reliability, and practicality.

2. Locate an individual with special needs (disabled or
limited-English proficient) and interview that individual
regarding the experiences he or she has had in being
tested for vocational occupations. Did the tests provide
a fair assessment of that individual's skills and
knowledge? What adaptations, if any, were made in the testing
procedures? Summarize your findings and share the
information with the class.



Discussion Questions 

1. What kinds of problems is a visually impaired person
likely to face in being tested for the occupation of
computer operator? What kinds of testing adaptations might
be made for this individual? After the discussion,
consider what you learned about your own and your
classmates' attitudes toward testing individuals with special
needs.

2. Discuss the implications of the following statement:
"Validity is the first requisite of any test. No matter
how satisfactory in all other respects, an instrument
that does not provide to the decision maker accurate
information of the type needed is worthless" (Erickson &
Wentling, 1976, p. 22).



Group Activity 

1. Divide the class into small groups, with each group
representing a different category of special needs. (For
example, one group may choose to be hearing-impaired
individuals. Another group may choose to be
limited-English speaking.) Roleplay the problems your group
would face in being tested for a vocational occupation.
Create your roleplays around actual problems you know
about from your own experience or have heard about from
other individuals.

GOAL 2: Summarize important considerations in designing
initial test specifications.



What Do You Include in Initial Test Specifications? 

The initial test specifications serve as the general
blueprint. In designing this blueprint, the test developer must
consider the purpose of the test and the limitations of the
"environment" in which the test will be used. Some limits
that should be considered are:

• The amount of time that can reasonably be expected for
administering a test to examinees

• The availability of equipment and materials for
performance testing

• The grade or mastery level of the typical examinee

The topics covered in an outline of test specifications are:

• The types of measures to be used (paper-and-pencil,
performance, or both) and their formats

• The total number of items in the finished form

• Total testing time

• The skill level to be assessed by the test

• General reading level of the instructions and questions

As the test is developed, various changes are likely to be
made, but the specifications should provide guidance and
coherence to a test development project.
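
The topics in such an outline can be kept together in one small record so that later changes are made in a single place. The sketch below is one possible layout, not a required format: the class name, field names, and all of the values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class InitialSpecification:
    """One record holding the topics of an initial test specification."""
    measure_types: tuple   # paper-and-pencil, performance, or both
    item_formats: tuple    # e.g. multiple-choice
    total_items: int       # items in the finished form
    total_minutes: int     # total testing time
    skill_level: str       # skill level to be assessed
    reading_level: str     # reading level of instructions and questions

    def minutes_per_item(self):
        """Average time an examinee has for each item."""
        return self.total_minutes / self.total_items

    def fits_time_limit(self, available_minutes):
        """Check the plan against the time the setting allows."""
        return self.total_minutes <= available_minutes


# Hypothetical values for illustration only.
spec = InitialSpecification(
    measure_types=("paper-and-pencil", "performance"),
    item_formats=("multiple-choice",),
    total_items=60,
    total_minutes=90,
    skill_level="entry level",
    reading_level="grade 8",
)
print(spec.minutes_per_item(), spec.fits_time_limit(120))
```

Checking the planned testing time against the time actually available is one simple way to apply the "environment" limits listed earlier.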

Types of Measures to be Used and Their Formats 

The two types of measures considered most useful for a
vocational competency test are paper-and-pencil tests and
performance tests. To fully assess all skills taught in a
vocational education program, you will likely need to develop a
test package made up of both paper-and-pencil and performance
tests. Both tests have strong and weak points which should be
carefully considered when deciding if or when to use one
measure or the other.

Paper-and-pencil tests: Their strengths and weaknesses.
By far the most common form of testing is paper-and-pencil.
Its popularity as a test format is largely based on its
flexibility, low cost, and ease of administration and scoring. A
paper-and-pencil test typically requires no special equipment
or specially trained staff. With a paper-and-pencil test it
is possible to test a sizable group of individuals at one
time, making an effective use of classroom time.

Paper-and-pencil tests do have drawbacks which can be
significant in vocational education. With a paper-and-pencil
test we can assess whether a student knows how to do a task,
but we have little information about whether a student actually
can do a task. A written test can also distort or bias our
assessments. For example, in assessing a person's knowledge of
small engine repair with a paper-and-pencil test, we are at the
same time assessing the person's knowledge of English, reading
skill, and skill in following written instructions. Overall,
however, paper-and-pencil tests are still the most generally
useful and practical means of assessing job knowledge.

Performance tests: Their strengths and weaknesses. A
performance test in vocational education involves the examinee
carrying out the actions that are expected to be performed "on
the job." A performance test can be structured in the form of
a simulation, copying a work situation with something less
than perfect fidelity, or as a work sample, which usually
involves an actual "slice" of the job.

Motor skills and interpersonal skills, such as dealing
with customers, are competencies which paper-and-pencil tests
cannot usually assess. Since most vocational educators are
concerned with "hands on" performance, a performance-based
test is appropriate for assessing most of the skills taught in
a vocational education program.

If, for example, an instructor of an auto mechanics course
wants to determine whether the students can install a piston
correctly, the most direct way of finding out would be to have
each student install a piston. A performance test used in a
standardized vocational competency test is simply the "teacher
approach" with standardized procedures for conducting and
assessing student performance. An advantage of this approach
is that there is no intermediate task between what students
are trained to perform and how they are assessed.




Although this approach has significant advantages for
assessment in vocational education, there are important
drawbacks:

1. Time: Performance tests are more time consuming.
Often they must be administered on a one-to-one basis
(one examiner to one examinee), with time required
between examinees for setting up the test.

2. Cost: Special equipment and materials are needed and
often consumed during testing.

Because of the time and cost constraints, performance
tests may be limited in the range of tasks that can be
assessed. Consequently, for maximum efficiency, they should
be used only for those competencies that cannot be assessed
adequately by paper-and-pencil tests.

Common formats of paper-and-pencil tests. The four most
common paper-and-pencil test formats are true-false,
matching, completion, and multiple-choice.

1. True-false: True-false test items have an advantage
in ease of construction but should be limited to
factual material. Examinees can usually answer a large
number of these items in a relatively short period of
time, allowing a broader sampling of knowledge than
is possible with other formats within the same time
period. Even when testing factual information, the
true-false format has serious disadvantages. Random
or "blind" guessing will allow a person, on the
average, to respond correctly to 50 percent of the items.
This high guess factor could allow someone who knows
nothing about the subject to get half the items
correct, thus negating any advantage gained by asking a
large number of questions.

2. Matching items: Matching is a format in which
questions are arranged in one column and alternatives or
answers are in a second column. Examinees are asked
to select the correct response in the second column
that corresponds to the question, word, or statement
in the first column. This format can be made to have
a high difficulty level and to reduce the proportion of
correct answers obtained through guessing. The questions must
be closely related so that, for any question, the
incorrect choices will serve as reasonable
distractors. The listing of items should be kept relatively
brief (10 to 15 alternatives) so that finding the
correct response does not become tedious. When making
up the two lists for matching, it is recommended that
one column contain several more items than the other.
This lessens the chance of selecting answers based
simply on the process of elimination.

3. Completion items: Completion items (sometimes referred
to as "fill in the blanks") are deceptively easy to
construct and have no significant guess factor.
Scoring, however, is difficult since no specific
responses or options are given, and responses cannot be
machine-scored. A subject matter expert may be
needed to determine the correctness of every
unanticipated response. Such a scoring procedure is slow,
expensive, and borders on the subjective. Under
conditions where there are no guarantees of the
expertise of the scorer or where a large number of tests
are to be administered, this format is not practical.

4. Multiple-choice items: A multiple-choice format is
usually the most desirable format for knowledge items
in a vocational competency test. In general, a
multiple-choice item consists of the stem, which may
be either a complete or an incomplete statement, and
several responses that answer the question directly
or complete the statement. Of the responses given,
only one is correct, and the rest serve as
distractors (see Table 2).

TABLE 2
COMPONENTS OF A MULTIPLE-CHOICE ITEM

An engine that fires each time the piston goes     STEM
up is a

A. 2-cycle engine.                                 CORRECT RESPONSE

B. 4-cycle engine.                                 DISTRACTORS
C. rotary engine.                                  (B through E)
D. supercharged engine.
E. turbocharged engine.



The number of responses usually varies between three
and five. If three responses are used, the
possibility of guessing correctly is about 33 percent; for
four responses, 25 percent; for five, 20 percent; and so
forth. Five responses are recommended as giving an
appropriate balance between item length and the
effects of guessing.

Constructing distractors (the incorrect responses) is
often the most difficult part in writing
multiple-choice test items. Each distractor must:

• agree grammatically with the stem,

• be rational and believable,

• be related to the topic area,

• NOT contain any hints to the correct answer,

• seem plausible and attractive to the uninformed or
poorly prepared examinee, and

• BE ABSOLUTELY WRONG.

The use of responses such as "all of the above," "A
and B but not C," etc., should be avoided in most
cases. These types of responses can be useful under
some circumstances, but they can easily become
crutches for item writers.
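
The guess factors quoted for the formats above all follow from one rule: when an item has k equally attractive responses, blind guessing picks the correct one 1/k of the time. The short sketch below works this out; the function name and the 50-item test length are hypothetical and serve only as an illustration.

```python
def chance_score(num_items, num_responses):
    """Expected number of items answered correctly by blind guessing."""
    if num_responses < 2:
        raise ValueError("an item needs at least two responses")
    return num_items / num_responses


# Guess factor for true-false (2) and for 3-, 4-, and 5-response items.
for k in (2, 3, 4, 5):
    print(f"{k} responses: about {100 / k:.0f} percent by blind guessing")

# On a hypothetical 50-item test, compare true-false with five responses.
print(chance_score(50, 2), chance_score(50, 5))
```

The comparison shows why adding responses reduces the effect of guessing, although each added response also lengthens the item and requires another plausible distractor.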

Although the multiple-choice test item format is the most
widely used one in large-scale testing efforts, other possible
formats should not be discarded. It is also common to mix
test formats, if for no other reason than to break the
monotony involved in testing. The test designer should use the
most appropriate format for each situation and weigh the
benefits and drawbacks inherent in each.

Types of performance evaluation. Two types of evaluation
procedures usually used for performance testing are product
evaluation and process evaluation.

1. Product evaluation: Evaluating a product is usually
the simpler of the two because the examiner evaluates
the results of actions and not the actions themselves.
Products can be measured or checked against specific
standards anytime after the examinee has completed
the test. Such things as a completed electrical
circuit or a carburetor rebuilt to factory
specifications are examples of products that can be evaluated.




2. Process evaluation: A process evaluation means that
actions and behaviors are assessed while the activity
is in progress. For example, the evaluation of meal
service by a waiter, or lifting a patient properly
from a bed by a nurse's aide, are evaluations of the
process performed. An actual product may or may not
result. The actions are viewed as the most important
component of the job task.

The type of performance evaluation used will depend on
what is being assessed. Often, a performance test is partly
process and partly product based.
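
The difference between the two types of evaluation can be made concrete with a pair of scoring helpers. This is a sketch under stated assumptions: the function names, the product standards, and the checklist steps are all hypothetical, and an actual performance test would take its standards from the occupation itself.

```python
def score_product(measurements, standards):
    """Fraction of product features that meet their standard.

    standards maps a feature name to (target, tolerance).
    """
    met = sum(
        1
        for feature, (target, tolerance) in standards.items()
        if abs(measurements[feature] - target) <= tolerance
    )
    return met / len(standards)


def score_process(observed_steps, required_steps):
    """Fraction of required steps the examiner saw being performed."""
    performed = sum(1 for step in required_steps if step in observed_steps)
    return performed / len(required_steps)


# Hypothetical product standards, checked after the work is finished.
standards = {"gap_mm": (0.70, 0.05), "torque_nm": (25.0, 2.0)}
product = score_product({"gap_mm": 0.72, "torque_nm": 30.0}, standards)

# Hypothetical process checklist, marked while the task is in progress.
required = ["lock bed wheels", "explain the move", "lift with legs"]
process = score_process({"lock bed wheels", "lift with legs"}, required)

print(product, round(process, 2))
```

The product score needs only the finished work and a set of standards, while the process score depends on an examiner observing the task as it happens.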

Individual Study Activities

1. Select an occupation from your area of occupational
specialty and develop an outline of test specifications for
that occupation. Consider whether you will use
standardized tests or develop your own. Consider the types of
measures you will use (paper-and-pencil, performance, or
both) and their formats; the total number of test items;
total testing time; the skill level to be assessed by the
test; and the general reading level of the instructions
and questions.

2. Using the Resources or Recommended References in this
module, or one of your own resources, develop two charts:
one listing the strengths and weaknesses of
paper-and-pencil tests and the other listing the strengths and
weaknesses of performance tests. Then compare the two
charts and determine which type of test has the greatest
overall strengths to meet the needs in your particular
setting.

Discussion Questions

1. "Measurement techniques for assessing occupational
students' achievement within the cognitive domain have been
used much more extensively than those used to assess
achievement in the affective, psychomotor, and perceptual
domains" (Erickson & Wentling, 1976, p. 86). Using this
statement as a basis of discussion, provide examples from
your own experience that support this statement.

2. "The ultimate in performance measurement for occupational
education is the assessment of a student's ability to
perform important job-related tasks in an actual job setting"
(Erickson & Wentling, 1976, p. 126). This type of
measurement, however, has not been used extensively by
vocational educators. Discuss why this is so and suggest
ways that performance tests might be used more.



Group Activity 

1. Divide the class into four groups, each group representing 
one of the four types of paper-and-pencil test formats: 
true-false, matching, completion, and multiple-choice. 
Each group will defend the use of its particular format 
in a vocational competency test. 




GOAL 3: Discuss the critical tasks in developing
paper-and-pencil tests.



How Do You Develop Paper-and-Pencil Tests?

Developing paper-and-pencil tests is a systematic process
consisting of a number of specific tasks. A discussion of
these tasks follows.

Create An Item Budget

The "budgeting of items" is a procedure for determining
the number of paper-and-pencil test items to develop within
each major test area. Your decisions about an item budget
must be based on your best estimate; no hard and fast rules
exist.

The first step is to establish the content of the item
budget, that is, which areas of an occupation will be
covered in the test. The content is derived from the task
inventory findings listing the skills and abilities that
employers and employees consider important. (See Module 18:
Determining Requirements for Vocational Competency Measures
for a discussion of developing a task inventory.)

The next step is to estimate the number of items for the
final test. For example, if between 50 and 60 test items are
wanted for the final version, at least double that number of
items should be prepared initially. The test developer must
then assign a percentage of the total number of test items
being developed to each major category. Examples of major
categories derived from an auto mechanics task inventory could
be "safety," "trouble-shooting," and "tune-up procedures."

If you decide to prepare 100 items initially, and the
test will have five major categories, you could assign the
same number to each category. In this case, you simply have:

    100 ÷ 5 = 20

Often, some categories are more important to the job than
others; you may wish to assign a larger proportion of items to
those categories. The distribution, then, could resemble the
following:

                No. of Items
    Category    Per Category

       A             30
       B             10
       C             20
       D             25
       E             15

     TOTAL          100
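
A category budget like the one above can be derived from the percentage assigned to each category. The sketch below is one way to do that, not a required procedure: the function name is hypothetical, and the rule for handing out items left over after whole-number division (give them to the most heavily weighted categories) is an assumption of this example rather than part of the module.

```python
def budget_by_category(percentages, total_items):
    """Turn whole-number category percentages into item counts."""
    if sum(percentages.values()) != 100:
        raise ValueError("category percentages must add up to 100")
    # Whole-number share of the total for each category.
    counts = {
        name: total_items * pct // 100 for name, pct in percentages.items()
    }
    # Assumed rule: leftover items go to the largest categories first.
    leftover = total_items - sum(counts.values())
    largest_first = sorted(percentages, key=percentages.get, reverse=True)
    for name in largest_first[:leftover]:
        counts[name] += 1
    return counts


percentages = {"A": 30, "B": 10, "C": 20, "D": 25, "E": 15}
print(budget_by_category(percentages, 100))
```

With 100 items the counts simply equal the percentages, as in the distribution shown above; with other totals the leftover rule keeps the category counts adding up to the planned number of items.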



Once the numbers of test items to be developed are
assigned to the major categories, the number of items to be
developed for each individual area or task within each major
area in the task inventory must be determined. If, for
example, a major category is allotted 30 questions, such as
category "A" above, and if the category has five specific
tasks, then the 30 questions must be assigned to these five
tasks.

Often, tasks to be used as the basis for item development
are rated by those surveyed as to their importance. If, for
example, tasks were rated either "moderately important" or
"very important," the first rating could be assigned a weight
of 1 and the second a weight of 2:

    Assigned Weight

    1 = moderately important
    2 = very important

We could then assign test items per task using the following
simple procedure. We sum the total weight of all the tasks:



                          Assigned
    Task    Importance     Weight

     1      moderate         1
     2      very             2
     3      very             2
     4      moderate         1
     5      moderate         1

            TOTAL            7

We then divide the allotted number of questions, 30, by the
total weight, 7:

    30 ÷ 7 = 4.3 (rounded to one decimal place)

The result, 4.3, is then multiplied by the weight of the task,
which in this case is either 1 or 2:



            Assigned      Weight        No. of Questions
    Task     Weight     Multiplier          Per Task

     1         1     x     4.3      =         4.3
     2         2     x     4.3      =         8.6
     3         2     x     4.3      =         8.6
     4         1     x     4.3      =         4.3
     5         1     x     4.3      =         4.3

Since these results are not whole numbers, it is necessary to
round to the nearest whole number to get the appropriate
number of questions to develop per task, so that the final result
is:

            No. of Questions
    Task        Per Task

     1             4
     2             9
     3             9
     4             4
     5             4

    TOTAL         30
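
The weighting procedure just described can be written out step by step. The sketch below follows the worked example (weights 1, 2, 2, 1, 1 and 30 allotted questions); the function name is hypothetical.

```python
def questions_per_task(weights, allotted):
    """Follow the worked example: multiplier, product, then rounding."""
    # Step 1: divide the allotted questions by the total weight.
    multiplier = round(allotted / sum(weights), 1)
    # Step 2: multiply each task's weight by the multiplier.
    exact = [round(weight * multiplier, 1) for weight in weights]
    # Step 3: round to the nearest whole number of questions.
    # Note: Python's round() sends exact halves to the nearest even number.
    rounded = [round(value) for value in exact]
    return multiplier, exact, rounded


multiplier, exact, rounded = questions_per_task([1, 2, 2, 1, 1], 30)
print(multiplier)             # 4.3
print(exact)                  # [4.3, 8.6, 8.6, 4.3, 4.3]
print(rounded, sum(rounded))  # [4, 9, 9, 4, 4] 30
```

In this example the rounded counts add up to exactly 30. With other weights the rounded total can come out one or two items above or below the allotment, in which case the item budget has to be adjusted by judgment.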

Tasks that are different aspects of the same general
skill are best grouped so as not to overtest in one area at
the expense of other tasks. In retail sales, for example,
"prepare a sales receipt" and "prepare a refund form" can
easily be combined since the skills involved are essentially
the same.



Assign Test Items 

Based on the test item budget, assign each item writer a
specific number of test items to write in specific categories.
Supply each writer with detailed instructions regarding the
nature of the task, specific requirements, and the format of
the questions to be developed. Table 3 is a listing of areas
that might be covered in instructions to item writers. These
instructions should clearly reflect the design goals of the
test developer.

Of course, items should be assigned to writers who are
knowledgeable about the specific tasks. But it is a good idea
to ask a second writer to submit items on the same tasks to
vary the emphasis and point of view.



TABLE 3

SUGGESTED INSTRUCTIONS TO TEST ITEM WRITERS

• Establish a time limit for item preparation.

• All items should be sent to the test development center (give address).

• After typing and editing at the development center, all items will be
  returned to the writer for review to make certain that editorial
  changes have not changed the technical content.

• For each item, indicate the correct answer and the topic to which the
  item is related.

• Staff members will meet with item writers as appropriate to discuss
  problems or concerns, and to review reasons for editorial changes.

• Five-choice, multiple-choice items are preferred. Items with fewer
  than five choices will be accepted if there is good reason for the
  smaller number of choices.

• The stem of each item should be a complete question or an incomplete
  statement. Stems with a blank in the middle should be avoided.

• All options must relate to the stem logically and grammatically.

• Discourage the use of options such as "All of the above," "None of the
  above," and "A and B above."¹

• Stems and options should be as short as possible while still being
  complete.

• Never repeat a word in the options if it can be included in the stem.

• Options should be arranged in some logical order such as:

     Numerical
     Alphabetical (for one- or two-word options)
     Length (for multi-word options)

• Avoid the use of words such as "always" and "never."

• Make certain that level of reading difficulty is appropriate to
  audience.

¹ Though options of this type may at times prove useful, item writers
  should be initially discouraged from their use because they are apt to
  be misused. If, for example, item writers are asked to prepare
  five-option multiple-choice items, it becomes extremely tempting to
  have a fifth option as "All of the above" or "None of the above"
  because it is an easy distractor to write.
Review and Modify Items

The test development staff should review each completed
item for clarity, completeness, correct grammar, etc. Items
that are unclear, unimportant, or have content problems requiring
technical expertise should be discussed with the item
writer. Occasionally, when different individuals prepare
items for the same occupational task, duplicate or near-duplicate
items will be written. You can keep the best item and
place the other in reserve, or possibly even combine the two
to produce a better test item.

Then compare the selected items to the item budget. You
may find that you have more items than you need in one area
and too few in another. In areas of shortage, the test developer
should work closely with the item writers to stimulate
ideas for questions. A "brainstorming" session can be very
productive. Once there is a close match with the budget, the
items should be prepared in final form. The language and
structure of the questions and their options should be logically
and grammatically consistent. All the items should then
be looked at as a unit so that extraneous cues can be removed.
A common error is that one test item will cue the correct
response in another item.

The items should then be reviewed by subject matter
experts not involved in the actual writing. You might have
vocational instructors or persons in industry not previously
involved act as reviewers. Be sure to remove items that are
factually in error or items in which a disagreement exists
about the correct response. A good strategy is to have the
experts "take the test." Although even experts can make
errors, it should be possible to catch most of the distractors
that may be correct responses to the question.

Table 4 lists questions that should be considered in
reviewing each item.
TABLE 4

AREAS OF CONCERN WHEN REVIEWING PAPER-AND-PENCIL TEST ITEMS

• Is the content of the item logical?

• Is sufficient information provided?

• Is the use and spelling of technical terms correct?

• Is there one and only one best answer?

• Is the correct answer keyed?

• How difficult is the item apt to be for a typical
  student?

• Can the item be improved? If so, how?




Balance the Key 

Balancing the key is one of the final stages in the
preparation of multiple-choice test items. It ensures that each
possible option has approximately an equal proportion of
correct answers assigned. It helps to eliminate any kind of bias
that test designers or item writers may have in the placement
of correct answers. Item writers have a tendency to place the
correct response as one of the middle options.

The balancing procedure to use is very simple:

1. List separately the correct response for all five-choice
   items, four-choice items, and three-choice items.

2. Sum the number of five-choice, four-choice, and three-choice
   items separately.

3. Divide each sum by the number of choices for that
   category.

4. The result will tell how many A, B, C, etc., correct
   responses should be in each category.

5. Compare the results against the actual number of
   items in each category.

6. Rearrange the options, whenever feasible, to match
   the balanced key.

The following is an example of this balancing procedure,
using five-choice items. The same procedure is used for four-
and three-choice items.



Item       Present     Adjustments     Revised
Choice     Key         Needed          Key

A           6          +2              8
B          12          -3              9
C          18          -9              9
D           1          +7              8
E           5          +3              8

TOTAL      42                          TOTAL 42

Number of items ÷ number of choices per item = 42 ÷ 5 = 8.4

We need about 8 questions having each item choice. In
this case, since the total is 42, two item choices must
have one extra each.
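
The same bookkeeping can be sketched in Python. This is a rough
illustration, not part of the procedure as published: the function
name `balance_key` is hypothetical, and the rule for deciding which
choices receive the leftover items (here, the choices that are already
most frequent, so that fewer options have to be moved) is an
assumption that happens to reproduce the example.

```python
from collections import Counter


def balance_key(present_key, choices="ABCDE"):
    """Return (target, adjustment) counts per choice for a balanced key."""
    present = Counter(present_key)
    base, extra = divmod(len(present_key), len(choices))
    # Assumption: leftover items go to the choices that are already most
    # frequent, so fewer items need their options rearranged.
    most_used = sorted(choices, key=lambda c: present[c], reverse=True)[:extra]
    target = {c: base + (1 if c in most_used else 0) for c in choices}
    adjustment = {c: target[c] - present[c] for c in choices}
    return target, adjustment


# Present key from the five-choice example (42 items).
present_key = "A" * 6 + "B" * 12 + "C" * 18 + "D" * 1 + "E" * 5
target, adjustment = balance_key(present_key)
print(target)
print(adjustment)
```

The output only says how many correct answers each position should
hold; deciding which items to rearrange is still a matter of judgment.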



When balancing the key, you cannot simply move options around
at will. The option ordering guidelines described in
Table 3 should be maintained.

Prepare Test Administration Instructions 

The best test can be a poor assessment tool if the
instructions for its administration are not clear and specific.
One method of providing these instructions is to prepare an
examiner's manual. This manual should contain all the information
required for administering the test. The manual is also
useful in supplying background information to aid the examiner
in understanding the overall structure and purpose of the test.
Table 5 lists content areas that might be covered in an
examiner's manual.
TABLE 5

SUGGESTED CONTENT AREAS OF AN EXAMINER'S TEST MANUAL

• Purpose of the test

• Expected background of examiner

• Ways test results may be used

• Overview of the test

• Administration of the test

     • test format

     • time required

     • answer sheets

     • instructions for administering the test

     • suggestions for testing individuals with special
       needs



Review and Revise (The Second Time)

When the test is completed, the entire package should be
reviewed by the members of the test development consultant
panel. To ensure useful reviews, detailed instructions should
be supplied to each reviewer. Table 4 can be used as a model
for the content of instructions to test reviewers.

Test revision based on the comments of review panel members
should be carefully performed to avoid introducing errors
into the test. Any changes of factual content that seem
questionable should be submitted to the item writers. Carefully
review any area where there are differences of opinion. If it
is not possible to agree on the correct answer, give serious
consideration to deleting the item.
Conduct a Pilot Test

Once the test has been revised, it is ready for pilot
testing. Pilot testing is a procedure used primarily to test
the structure of the test, e.g., clarity of instructions, time
estimates, etc., rather than the content. The procedure for
conducting a pilot test is to have two or three vocational
training programs each administer the test to two or three
students. The tests should each be at a different school with
different examiners and students. A packet of the complete
test instructions should be sent to each examiner one or two
weeks in advance.

The examiner should be informed that the test designer
will be at the test site as an observer and to debrief the
examiner and examinees. Make it very clear that the test
itself is being tested and that any problems with the test are
not a reflection on the examiner or student but rather on the
test. Ask the examiner and the examinees to be completely
frank and not worry about hurting the feelings of the test
designer.

During testing, the test designer will act as an observer
and should not supply assistance or answer questions. It is
appropriate, however, to stop and ask questions for clarification.
Careful notes should be kept, noting the strong and
weak points of test administration.

After completion of the test, the observer should review
the test with the examiner and students. An item-by-item
review is most desirable. On completion of the pilot tests,
corrections should be made to the test based on the findings.
The findings of pilot testing will help improve instructions
and "packaging" of the test. Once these problems are corrected,
you are ready for field testing.

Conduct a Field Test

Field testing is used to determine the quality of the test
items. The field test involves a large number of students in
a number of schools. Field testing requires considerable
preplanning in that agreements must be obtained from the schools
and instructors. There are no hard and fast rules regarding
the size of the sample. Approximately 100 students spread
over five schools is a reasonable target.
Analyze Field Test Findings

Field testing will provide the test developer with a large
amount of data to analyze. If possible, the developer should
work closely with someone skilled in statistical analysis. To
do a good job of analysis, access to a computer is necessary.
If you have access to a computer, most likely you will find
one of the numerous statistical software packages in place.
Most are simple to use and can accommodate the level of
statistical analysis required for item analysis purposes.



Revise the Test

In order to decide which items to keep and which to discard,
it is necessary to get statistical data on each item to
answer the following questions.

1. How well does examinee performance on each item correlate
   with overall examinee performance on all items
   in a particular subject matter area? (How well does
   an item discriminate between those who perform well
   and those who perform poorly overall?)

2. What was the difficulty level; that is, what proportion
   of examinees answered an item correctly?

Your prime concern should be to select items that performed
well in the field test and to end up with a test that
has about the same percentage of items in each section as
specified in the item budget. Your criteria for item selection
should be threefold:

1. Content: The items should be representative of the
   performance area.

2. Difficulty: What was the proportion of examinees that
   missed any item? In competency tests, items should
   be included over a range of difficulty to reflect a
   range of competency.

3. Discrimination: Do those who do well on the test generally
   answer the item correctly and those who do
   poorly overall generally answer the item incorrectly?

In analyzing the paper-and-pencil items, consider these
points. Distractors that were chosen by few examinees might
need to be modified or deleted. Items answered correctly by
all examinees should be eliminated or changed. Items which
are seldom answered "correctly" should be checked for keying.
If the key is correct, they should be carefully reviewed by
subject matter experts to determine whether there is a
possibility of misinterpretation.
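
The two statistics discussed in this section can be sketched in Python
as follows. This is a minimal illustration under stated assumptions:
the function names, the 0/1 score layout, and the sample scores are
made up, and discrimination is computed here as a Pearson correlation
between each item and the total of the remaining items, which is one
common choice rather than a method prescribed by this module.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None when either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def item_statistics(scores):
    """scores: one list of 0/1 item scores per examinee (hypothetical layout)."""
    totals = [sum(row) for row in scores]
    stats = []
    for i in range(len(scores[0])):
        item = [row[i] for row in scores]
        # Assumption: correlate with the total of the other items so that
        # an item is not correlated with itself.
        rest = [t - s for t, s in zip(totals, item)]
        stats.append({
            "item": i + 1,
            "difficulty": sum(item) / len(item),  # proportion answering correctly
            "discrimination": pearson(item, rest),
        })
    return stats


# Made-up scores for five examinees on four items.
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
]
for row in item_statistics(scores):
    print(row)
```

In this made-up data the first item is answered correctly by every
examinee, so it has no variance and its discrimination cannot be
computed; it is the kind of item the text says should be eliminated or
changed.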

After you have selected test items that are acceptable on
the basis of content, difficulty, and discrimination, make
sure you have the same item distribution as you had on the
field test. If some areas do not have enough items, look
again at those items not selected.

Generally, poor items should be removed rather than
rewritten, but if the change is small the test developer may
want to make it in order to have a useful item. If a major
change is required, then the rewritten item should be
considered a new item that has not been field tested.

On completion of the final revision, you should have a
useful paper-and-pencil test meeting the design requirements
established at the start of the project.
Individual Study Activities

1. Obtain a teacher-made occupational competency test from
   your area of occupational specialty. Randomly select 10
   multiple-choice, paper-and-pencil test items and review
   them to answer the following questions: Is the content
   of the item logical? Is sufficient information provided?
   Is the use and spelling of technical terms correct? Is
   there one and only one best answer? How difficult is the
   item for a typical student? Can the item be improved?
   If so, how? Revise the test items based on your review.

2. Obtain a task inventory for an occupation from your area
   of specialty. Select two of the very important tasks and
   write five multiple-choice test items for each task.
   Follow the suggested guidelines in Table 3 of this module.
   Then exchange your items with a partner and review
   one another's items based on the review questions above.
   Return the items to the writer and revise them according
   to your partner's suggestions. You must determine, of
   course, whether the suggestions are appropriate.

Discussion Questions 


1. This module discusses a number of specific tasks for
   developing paper-and-pencil tests. How practical are
   these tasks for your particular setting? If the members
   of the class were part of a district-wide test development
   team, which of the test development tasks would you
   expect to be able to accomplish? Write these tasks on
   the chalkboard and reach a consensus regarding the tasks
   you will do.

2. Why do you think it is necessary to develop the examiner's
   test manual as part of a total test package? What topics
   would you include in such a manual and why?




Group Activity 

1. Organize class members into a test development team.
   Select a leader and assign responsibilities among members.
   Your goal is to produce a paper-and-pencil vocational
   competency test for an occupational area of need
   in your district or state. If all the skills necessary
   for test development are not represented on your team,
   indicate what skills are lacking and where you would
   obtain that expertise.
GOAL 4: Discuss the critical tasks in developing performance
tests.



How Do You Develop Performance Tests?

Many of the tasks required for performance test development
can be carried out in conjunction with those required for
the development of paper-and-pencil tests.

The following five components should be covered in each 
performance test: 

1. Purpose: Statement giving overview of the task, the
   specific subtasks involved, and the uses that can be made of
   the findings. This statement is for the use of the
   examiner.

2. Instructions to examiner: Written descriptions/
   instructions of exactly what an examiner is expected to
   do step-by-step during all aspects of testing. The
   instructions should be detailed and specific.

3. List of required equipment: Detailed description of the
   exact layout of the test site and all required materials.

4. Instructions to examinee: Written instructions either read
   by or to the examinee. They must be brief yet complete,
   giving no extraneous information that could distract the
   examinee from the task at hand.

5. Rating form: Document that lists checkpoints for assessing
   the job competency of the examinee on the specific
   task being performed. Each checkpoint should have a rating
   checkoff on which the examiner can quickly record the
   correctness or acceptability of the examinee's performance.



Select the Tasks to be Tested

Performance tests are more difficult to construct and
more expensive to administer than paper-and-pencil tests.
Therefore, performance tests should be developed to cover only
important areas that cannot be tested adequately using a
paper-and-pencil test. It is important to discuss each proposed
area with subject matter experts. Some questions to ask in
selecting tasks for a performance test are:

1. Can the skills required to perform the task only be
   assessed adequately with a performance test and not
   with a paper-and-pencil test?

2. Was the area rated as important by employers and
   employees in the task inventory survey?

3. Are the equipment and required materials available or
   easily obtainable at a training site?

4. Will the cost in consumable items per examinee be
   reasonable?

Structure the Tasks

As you can see from the list of components of a performance
test, the final package is much more complex than a
paper-and-pencil test package. It is the designer's job to
make all the parts fit together so that those using the test
feel it is simple to set up, administer, and score.

First, the task should be divided into all its observable
behaviors and products. For example, if an examinee is asked
to replace the head gasket of an automobile engine, the task
should be broken down and each step recorded. This includes
even those actions that seem trivial. In the case mentioned,
we would start with:

1. Examinee opens hood of car.

2. Determines model of engine.

3. Checks to see whether proper gasket is available.

The listing would be finished when the examinee closes the hood
of the car. The list should be ordered as closely as possible
to the way the task is typically performed. The test developers
should then look for possible products resulting from
the process. In our example, such points as "heads torqued to
proper specification" and "gasket straight and not leaking" are
potential products that can be measured.

From the list, those points that are (a) important to the
job, (b) likely to be performed wrong, and (c) relatively
independent of other actions required for performance of the task
should be extracted and used as assessment checkpoints. The
checkpoints should then be written as descriptions of the
action, and in enough detail to ensure that an examiner can
make a judgment about correctness of a behavior/action. Too
much detail can make it difficult for an examiner to rate an
action in an ongoing work situation.

Table 6 is an example of checkpoints at the proper level
of detail for one important task in overhauling a diesel
engine.

Prepare Administration Instructions

To ensure that a performance test is administered the same
way to all examinees, detailed administration instructions must
be prepared. An examiner's manual should cover all the
components of a performance test. It should also include
suggestions for testing individuals with special needs. Though the
instructions to the examiner should be detailed enough to
prevent variation between tests, they should not be unnecessarily
wordy.

Develop Rating Sheets

Rating sheets on which the examiner will rate examinee
performance should be easy to follow and at the same time
permit a thorough evaluation of the examinee's performance.
The rating form should allow an examiner to observe and rate
the examinee on each checkpoint without missing any of the
examinee's ongoing performance on the test. Table 7 is an
example of a rating sheet derived from the checklist shown in
Table 6. Note that product checkpoints 12-16 have been added
to the initial process checkpoints.

Review by Consultants

As with paper-and-pencil tests, performance tests should
be reviewed by subject matter experts in industry and by
vocational educators. The points that reviewers should keep in
mind are listed in Table 8. If a paper-and-pencil test is
developed, it would be most efficient to have both the
paper-and-pencil and performance tests evaluated by the same
individuals at the same time. The reviewers can then look at the
complete test package.
TABLE 6

CHECKPOINTS USED IN THE DEVELOPMENT OF A
PERFORMANCE TEST FOR DIESEL MECHANIC

Performance Test: Measure Gap and Side Clearance
and Install Piston Rings

Task                      Performance Criteria Checkpoints

Measure ring end           1. Measures each ring individually.
gap clearance.             2. Uses piston to push each ring
                              into sleeve.
                           3. Makes sure rings are
                              straight in sleeve.
                           4. Uses feeler gauge to measure
                              gap.

Measure ring to            5. Holds each ring against proper
groove side                   groove on piston or mounts
clearance.                    rings in proper grooves.
                           6. Uses feeler gauge to measure
                              side clearance of each ring.

Install rings on           7. Uses expanders to install each
piston.                       ring on piston.
                           8. Installs each ring in proper
                              groove.

Prepare for                9. Staggers ring gaps.
installing piston         10. Ensures that no ring gap is in
in sleeve.                    line with wrist pin hole.
                          11. Cleans and puts tools away.
TABLE 7

A SAMPLE RATING SHEET FOR DIESEL MECHANIC

Performance Test: Measure Gap and Side Clearance and Install Piston Rings

Performance Test Record Sheet

Examinee ____________________     Examiner ____________________

Date (Mo. Day Yr.) __________     School/Employer ______________

Examiner must enter the measurements and the specifications from the
manual.

Evaluate the examinee's performance on the following tasks by checking
either the "Yes" or "No" column.

Task                      Performance Criteria                      Yes   No

Measure ring end           1. Measures each ring individually       ___   ___
gap clearance              2. Uses piston to push each ring
                              into sleeve                           ___   ___
                           3. Makes sure rings are straight
                              in sleeve                             ___   ___
                           4. Uses feeler gauge to measure gap      ___   ___

Measure ring to            5. Holds each ring against proper
groove side                   groove on piston or mounts rings
clearance                     in proper grooves                     ___   ___
                           6. Uses feeler gauge to measure side
                              clearance of each ring                ___   ___

Install rings on           7. Uses expanders to install each
piston                        ring on piston                        ___   ___
                           8. Installs each ring in proper groove   ___   ___

Prepare for                9. Staggers ring gaps                    ___   ___
installing piston         10. Ensures that no ring gap is in
in sleeve                     line with wrist pin hole              ___   ___
                          11. Cleans and puts tools away            ___   ___

Top ring: gap             12. Manual specification recorded by
                              examinee (____) matches manual
                              specification (____)                  ___   ___
                          13. Gap measured by examinee (____)
                              matches gap measured by
                              examiner (____)                       ___   ___

Top ring: side            15. Manual specification recorded by
clearance                     examinee (____) matches manual
                              specification (____)                  ___   ___
                          16. Side clearance measured by
                              examinee (____) matches side
                              clearance measured by
                              examiner (____)                       ___   ___

Second ring: gap          18. Gap measured by examinee (____)
                              matches gap measured by
                              examiner (____)                       ___   ___
                          19. Examinee correctly checks "GO"
                              (ring does not need replacement)      ___   ___

Second ring: side         20. Manual specification recorded by
clearance                     examinee (____) matches manual
                              specification (____)                  ___   ___
                          21. Side clearance measured by
                              examinee (____) matches side
                              clearance measured by
                              examiner (____)                       ___   ___
TABLE 8

CHECKLIST FOR REVIEWING PERFORMANCE TEST ITEMS

• Is the situation realistic?

• Are all required supplies and materials listed? If
  not, what is missing?

• Are the instructions adequate?

• Are all the relevant topics for evaluation included?
  If not, what topics are missing?

• Are the topics for evaluation listed in the order in
  which they would be carried out by an examinee taking
  the test?

• Are the materials provided sufficiently clear and
  complete? If not, what should be added to make them more
  satisfactory?

• Are the approximate time limits indicated in the test
  satisfactory?

• How can the problem be improved?



Test the Performance Test 

The procedures for pilot testing and field testing a
performance test are basically the same as those used for a
paper-and-pencil test.

The individual performance tasks should correlate
positively, but not necessarily highly, with the paper-and-pencil
test. If one or more correlate very highly, that raises a
question of the cost effectiveness of these performance
measures. Another concern is if a performance test has a
negative correlation with the paper-and-pencil test. The
correlations among performance measures should also be reviewed.
Ideally, they will have low-positive correlations.
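
As a rough sketch of this check in Python: the function names, the
sample scores, and especially the 0.8 cutoff for "very highly" are
assumptions made for illustration; the module itself gives no
numerical threshold.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; None when either variable has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def review_task(performance, written, high=0.8):
    """Classify one performance task by its correlation with the written test.

    high is a hypothetical cutoff for "very highly"; the module gives none.
    """
    r = pearson(performance, written)
    if r is None:
        return r, "no variance; cannot evaluate"
    if r < 0:
        return r, "negative; investigate"
    if r >= high:
        return r, "very high; question cost effectiveness"
    return r, "positive; acceptable"


# Made-up scores for five examinees.
written = [10, 12, 15, 18, 20]
task_a = [3, 4, 3, 5, 4]   # moderately related to the written test
task_b = [5, 4, 4, 2, 1]   # runs opposite to the written test
task_c = [1, 2, 3, 4, 5]   # almost duplicates the written test

for name, task in [("A", task_a), ("B", task_b), ("C", task_c)]:
    print(name, review_task(task, written))
```

The same `pearson` helper can be applied to each pair of performance
tasks to review the correlations among the performance measures
themselves.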
Within each performance test, checkpoints that are performed
correctly by all or nearly all of the examinees should be
carefully reviewed. The importance of the item to the occupation
and the task should be considered. Checkpoints that are missed by
all or nearly all examinees should be restudied to determine what
is causing the problem.

As part of final revision, the test developer should spend
time working on the appearance and layout of the test. Every
effort should be made to reduce the amount of paper and paperwork.
The final version of the test should be clear, well-organized, and
easily administered. At this point, the developer should have a
complete test package that will give objective information to
directors of vocational education, instructors, and students about
the capability, strengths, and weaknesses of students and programs.
Individual Study Activities

1. Obtain a teacher-made performance test in an area of your
   occupational specialty. Review the test to answer the
   following questions: Is the situation realistic? Are
   all required supplies and materials listed? If not, what
   is missing? Are the instructions adequate? Are all the
   relevant topics for evaluation included? If not, what
   topics are missing? Are the topics for evaluation listed
   in the order in which they would be carried out by an
   examinee taking the test? Are the materials provided
   sufficiently clear and complete? If not, what should be
   added to make them more satisfactory? Are the approximate
   time limits indicated in the test satisfactory? How can
   the test be improved?

2. Obtain a task inventory for an occupation from your area
   of specialty. Determine which tasks would be appropriately
   assessed by a performance test. Some questions to
   ask yourself are: Can the skills required to perform the
   task only be assessed adequately with a performance test
   and not with a paper-and-pencil test? Are the equipment
   and required materials available or easily obtainable at
   a training site? Will the cost in consumable items per
   examinee be reasonable? Provide reasons for your selection
   of tasks for performance testing.

Discussion Questions 

1. "In assessing student performance in occupational education
programs, either or both the process or product of
the task should be measured. Both product and process
assessments have their advantages — a decision must be made 
regarding which should be used" (Erickson & Wentling,
1976, p. 128). From your own experiences, can you think 
of situations in which it would be advantageous to look 

at both process and product? Discuss these situations. 

2. "The common conception that paper-and-pencil tests of
performance can only measure cognitive functioning is not
entirely true. Many paper-and-pencil tests can provide 
direct assessment of job performances" (Erickson & 
Wentling, 1976, p. 155). From your own experiences, can 
you think of examples of a written test that in fact 
serves as a performance test? Discuss these examples. 




Group Activity

1. Organize class members into a test development team. The
   goal of the team is to produce a performance test for an
   occupational area of need in your district or state. As
   team members, determine which specific tasks you will
   reasonably be able to carry out in your particular setting
   to achieve this goal. Prepare an action plan for developing
   the performance test, indicating tasks and individuals
   responsible for carrying out those tasks.
Summary

The procedures described in this module, in combination
with the other three modules of this series, will help you
develop a useful and valid vocational competency test. This
module has focused on techniques and procedures essential to a
good test development effort.

Another component of equal importance, but beyond the
scope of this module, is teamwork. Developing a vocational
competency test is not a one-person job. It requires the
input of many people. Without this input, the developer would
probably have only a "classroom test made large" and not a
test that realistically measures occupational competency as
determined by business and industry.

A test designed following these procedures will serve as
more than simply an assessment of student performance. It
will permit programs to be evaluated on the basis of those
skills desired by business and industry.

We feel vocational training programs should not focus on
only those areas covered in a test. For the very practical
considerations of time and budget, a test can only be a sample
of the topics included in a good training program. Nevertheless,
the results of a good test assessing important competencies
will definitely contribute to improving vocational
education programs.
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Self-- Check 



GOAL 1 

1. What is the importance of test validity , test reliability, 
and test practicality in vocational competency test 
development? * K 

2. Why is it important to develop test procedures that accom- 
modate individuals with- special needs?- 



GOAL 2 

1. .What items should be included in initial test specif i- 

cations? 

2. For .both paper-and-pencij. tests and performance tests,. 

' what are, two major strengths and two' major Weaknesses ,of 
each? * 

3. What are four common formats of paper-and-pencil tests? 
4; What are two types of performance evaluation? 

GOAL 3 

1. What is the purpose of creating a test item budget for 
paper-and-pencil tests? 

2. What are important considerations in initially reviewing 
and modifying paper-and-pencil test items? 

3. What are the differences between pilot testing and field 
testing vocational competency tests? 

4. After field testing, on what basis should paper-and-pencil 
test items be revised? 

GOAL 4 

1. What are the components of a performance test? 

2. What is the major consideration in selecting tasks for 
performance test development? 

3. Performance tests should be reviewed by subject matter 
experts in industry and vocational education. What 
considerations should they keep in mind when reviewing these 
tests? 



Self-Check Responses 



GOAL 1 

1. Any good test should have validity, that is, it should 
measure what it is intended to measure. 

Any good test should have reliability, that is, it should 
be able to identify, with a high degree of accuracy, the 
relative standing of any group of examinees. 

To be useful a test must have practicality in terms of 
construction, administration, and scoring. It should 
have an adequate amount of development time budgeted, be 
capable of administration without confusion, and easy and 
straightforward to score. 

2. Planning testing procedures to accommodate individuals 
with special needs ensures that all examinees have a fair 
and equal opportunity to be tested on their skills and 
knowledge. 



GOAL 2 

1. Initial test specifications should include: 

• Types of measures to be used and their formats 

• Total number of items 

• Total testing time 

• Skill level to be assessed by the test 

• General reading level of instructions and questions 

2. Strengths of Paper-and-Pencil Tests 

• Flexibility, low cost, ease of administration and 
scoring 

• Require no special equipment or specially trained 
staff 

Weaknesses of Paper-and-Pencil Tests 

• Often provide little information about whether a 
student can do a task 

• Can distort or bias assessments if reading difficulty 
exceeds job demands 

Strengths of Performance Tests 

• Can assess motor skills and interpersonal skills 

• Allow students to be assessed directly on what they 
are trained to perform 

Weaknesses of Performance Tests 

• Are time-consuming 

• Involve costly equipment and materials often consumed 
during testing 

3. Common formats of paper-and-pencil tests: 

• True-false items 

• Matching items 

• Completion items 

• Multiple-choice items 

4. Types of performance evaluation: 

• Product evaluation 

• Process evaluation 



GOAL 3 

1. The "budgeting of items" is a procedure to determine the 
number of paper-and-pencil test items to develop for each 
major test area. 

2. Considerations in initially reviewing and modifying 
paper-and-pencil test items: 

• Is the content of the item logical? 

• Is sufficient information provided? 

• Is the use and spelling of technical terms correct? 

• Is there one and only one best answer? 

• Is the correct answer keyed? 

• How difficult is the item for a typical student? 

• Can the item be improved? If so, how? 

3. Pilot testing is a procedure used primarily to test the 
structure of the test rather than the content. It is a 
small-scale test, involving only a few students in a few 
programs. 

Field testing is conducted to determine the quality of 
the test items in terms of content. It involves a large 
number of students in a large number of schools. 

4. Bases for revising paper-and-pencil test items after field 
testing: 

• Content 

• Difficulty 

• Discrimination 
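
The difficulty and discrimination bases named in response 4 can be estimated from field-test results. The sketch below is illustrative only and is not taken from this module. It assumes each response has been scored 1 (correct) or 0 (incorrect), treats difficulty as the proportion of examinees answering an item correctly, and treats discrimination as the difference in that proportion between the highest-scoring and lowest-scoring groups of examinees. The function name `item_statistics`, the `group_fraction` parameter, and the sample data are hypothetical.

```python
from typing import Dict, List


def item_statistics(
    responses: List[List[int]], group_fraction: float = 0.27
) -> List[Dict[str, float]]:
    """Return a difficulty and discrimination value for each test item.

    responses[s][i] is 1 if examinee s answered item i correctly, else 0.
    group_fraction is the (assumed) share of examinees placed in each of
    the upper and lower comparison groups.
    """
    n_examinees = len(responses)
    n_items = len(responses[0])

    # Rank examinees by total score, highest first.
    ranked = sorted(responses, key=sum, reverse=True)

    # Size of each comparison group (at least one examinee).
    k = max(1, round(n_examinees * group_fraction))
    upper, lower = ranked[:k], ranked[-k:]

    stats = []
    for i in range(n_items):
        # Difficulty: proportion of all examinees answering correctly.
        difficulty = sum(row[i] for row in responses) / n_examinees
        # Discrimination: upper-group proportion minus lower-group proportion.
        p_upper = sum(row[i] for row in upper) / k
        p_lower = sum(row[i] for row in lower) / k
        stats.append(
            {"difficulty": difficulty, "discrimination": p_upper - p_lower}
        )
    return stats


# Hypothetical field-test data: six examinees, four items.
field_test = [
    [1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
]

for number, result in enumerate(item_statistics(field_test, 0.5), start=1):
    print(number, round(result["difficulty"], 2), round(result["discrimination"], 2))
```

In this made-up data, every examinee answers the first item correctly, so it separates no one, and the last item is answered correctly more often by low scorers than by high scorers. Under the assumptions above, items like these would be flagged for a closer look at their content.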

GOAL 4 

1. Components of a performance test: 

• Purpose 

• Instructions to examiner 

• List of required equipment 

• Instructions to examinee 

• Rating form 

2. The major consideration in selecting tasks for 
performance test development is whether or not the skills 
required to perform the task can only be assessed 
adequately with a performance test and not with a 
paper-and-pencil test. 

3. Review of performance tests by subject matter experts 
should include: 

• Is the situation realistic? 

• Are all required supplies and materials listed? If 
not, what is missing? 

• Are the instructions adequate? 

• Are the relevant topics for evaluation included? If 
not, what topics are missing? 

• Are the topics for evaluation listed in the order in 
which they would be carried out by an examinee taking 
the test? 

• Are the materials provided sufficiently clear and 
complete? What should be added to make them more 
appropriate? 

• Are the approximate time limits indicated in the test 
realistic? 

• How can the test be improved? 